Several issues surround the continuing implementation of the Chance
and Data component of the mathematics curriculum in Australia. First
is its survival. Second is the question of how far toward formal inference
the curriculum should take students (assuming it survives). Third is what
contexts are amenable to understanding the concepts in the curriculum.
Fourth is what tools are available to save time and assist in the learning
process. One of the ways of ensuring survival is to convince decision makers
that Chance and Data can be taught and learned successfully.

As is probably true of other parts of the mathematics curriculum, there 
is sometimes a tendency in Chance and Data to focus on small components, 
without spending time to fit them into the overall picture of handling data 
to answer questions and draw conclusions. Playing games with dice might 
be fun but how does the activity lead to answering meaningful questions? 
Finding the mean of a set of numbers might be good practice in addition 
and division but what does it convey about the set of numbers and how can 
it be useful in answering a question? Drawing a graph might create an 
attractive, colourful picture but what story is told about a data set, its vari- 
ation and its clusters of values? Statistics is about telling stories and 
answering questions based on various types of data. For statisticians the 
questions involve collecting samples from populations and drawing infer- 
ences about the latter from the former, usually based on random selection. 
For school students statistics is likely to be more what Tukey (1977) called 
exploratory data analysis, perhaps answering questions limited to their own 
experience on a known population from which a convenience, rather than 
random, sample is drawn. One of the aims across the middle years of school 
should be to provide students with a pathway for asking questions about
populations within which they see themselves as members. This pathway is
signposted with the techniques, such as finding middles, drawing repre- 
sentations, and describing variation, which assist in telling stories and 
answering questions. Associated with these techniques there are now soft- 
ware packages that will ease the computational burden and provide visual 
representations to make decision making more intuitive than in the past. 
The packages can change the focus from performing computations to inter- 
preting and explaining. Experiencing this process is part of informal 
inference, which will lay the foundation for formal inference in later years. 

This article uses a familiar setting to explore the issues associated with
developing ideas of informal inference and introduces the software package, 
TinkerPlots (Konold & Miller, 2005), as a tool to facilitate this development. 
Those wishing to follow up on information about TinkerPlots can download 
a trial version at www.keypress.com. An evaluation of TinkerPlots as an 
educational data analysis package is provided by Fitzallen (2007). The value 
of one representation provided in TinkerPlots, the hat plot, is explored in 
detail by Watson, Fitzallen, Wilson, and Creed (in press). 

The activities suggested in this article are intended for use with middle 
and secondary students (grades 6 to 10). It is acknowledged, however, that 
teachers in a school might need to work together to gain an appreciation of 
the expected development of understanding and plan for the background 
and level of the students they teach. The data and suggestions presented 
here have arisen mainly from workshops with inservice middle school 
teachers and preservice primary teachers, and hence may provide models 
for similar sessions, as well as for activities in the classroom. Examples of 
student work from grade 7 are also included. 

The context chosen for the investigations is body measurement. 
Activities based on measuring hand span, foot length, arm span, and height 
have been described by others (e.g., Clarke, 1996; Lovitt & Clarke, 1992) 
and the famous drawing by Leonardo da Vinci of the Vitruvian Man is often 
used as a motivation for asking a question about arm span equalling height. 
The recent Australian Bureau of Statistics CensusAtSchool survey asked 
students for measurements of the height of their belly button from the floor, 
the length of their right foot, and their total height; these measurements
hence provide an excellent database from which random samples can be
collected (“2006 CensusAtSchool Questionnaire”, 2006). 

When planning a unit of work that aims to develop ideas associated with 
informal inference, the starting point and questions need to be considered 
carefully. Some mathematics educators, for example, would suggest begin- 
ning with the da Vinci drawing and asking a question about the population 
at large: Do you think it is true for all the people in the world that their arm 
span lengths are equal to their heights? Discussion would evolve into how 
this question could be answered, with suggestions about appropriate kinds 
of data collection. Most students will be interested in checking themselves 
and collecting data from their classmates. Most high school teachers would 
assume that collecting these data will lead to the production of a scatterplot 
with arm span measured on one axis and height on the other. Jumping 
straight into this type of investigation may be appropriate for students with 
some previous experience in data handling and graphing (perhaps in grade 
9 and above) but for younger students it seems more appropriate to begin 
with a less complex scenario in terms of the data handling expectations. 
Thus, even though the question that begins with a population is quite easy 
to understand, the techniques required to provide an answer may be rela- 
tively sophisticated. 

A less demanding approach for students may be to start with a measure- 
ment activity, asking how accurately the arm span of a particular member 
of the class (or the teacher) can be measured (Konold & Pollatsek, 2002; 
Shaughnessy, 2006). In this way, questions of accuracy and variation can 
be introduced: What does it mean to make an accurate measurement? What 
variation can we expect in a measurement? Why is accuracy important? 
These questions can in fact be considered in a narrow classroom context,
such as that suggested here, or expanded to consider wider social or scientific
contexts. From this initial investigation, students can be encouraged to 
think about what a typical arm span measurement for a particular age or
grade level might be, before comparing the arm span measurements of two
groups, such as boys and girls. Finally, students can investigate the
association of two variables, arm span and height, and draw conclusions from
this. The questions discussed in the previous paragraph about a larger 
population can now be explored with more confidence. The following four
investigations provide examples of how such work might proceed.


Investigation 1: Measuring accurately

Four questions similar to the following can be used to begin an investiga- 
tion of accuracy in measurement. 

1. What does it mean to make an accurate measurement?

2. What variation can be expected if a measurement is repeated? 

3. Why is accuracy important? 

4. How confident can we be that we have the “true” measurement?

Although these questions appear to be about measuring, not statistics,
statistics can be used to help answer them. The following steps in the
investigation provide starting points for teachers to adapt for their classes.

Setting the question 

How accurately can the arm span of a person be measured? What method 
should be used? What would be a reasonable estimate? 

Discussion of various methods of measuring is likely to be a good place 
to begin to answer the question. Why might more than one measurement be 
needed? All students in the class can contribute by suggesting how they 
would make the measurement. Would students expect all measurements to 
be the same? Issues might include whether a person would stand or lie on 
the floor, what instruments would be used to make the measurements, and 
what accuracy of measurement should be recorded. 

Data collection (interval data)

Each person measures the arm span of a single selected person (say, with 
arms spread out, to the nearest 0.5 cm). Discussion can focus on how many
measurements would be needed for a good estimate of the actual value. 

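Table 1. Example of how the data can be recorded.

No. | Measurer's Name | Arm Span Length | No. | Measurer's Name | Arm Span Length
1   |                 |                 | 11  |                 |
2   |                 |                 | 12  |                 |
3   |                 |                 | 13  |                 |
4   |                 |                 | 14  |                 |
5   |                 |                 | 15  |                 |
6   |                 |                 | 16  |                 |
7   |                 |                 | 17  |                 |
8   |                 |                 | 18  |                 |
9   |                 |                 | 19  |                 |
10  |                 |                 | 20  |                 |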

Figure 1. Example of a TinkerPlots data card and a stacked dot plot
showing the arm span length of a single person measured by 14
different people.



Representing data 

The collected data can be listed and ordered in a table similar to the one in 
Table 1 and initial discussion based on values observed in the table. What 
is the largest value measured? What is the smallest value? Are any values 
repeated? Students can enter the data on TinkerPlots data cards and create
a graphical representation for the measurements. Figure 1 displays an 
example of a stacked dot plot. 

Summarising data 

Using the tools available in TinkerPlots, students can mark the mean, 
median, mode, and range on the line plot. Any interesting features can then 
be discussed. Are any of the averages the same? Are there any outliers? Can 
they be explained? In Figure 1, for example, the values of 186 cm and 187
cm were measured by one person with a shorter ruler than the other people 
used and by another person who measured “over Nathan’s body” rather 
than flat on the floor under him. 

Constructing a hat plot in TinkerPlots is often helpful in summarising a 
data set. A default hat plot covers the middle 50% of the data values under 
its crown and the bottom and top 25% under its brims. Does the hat plot 
help describe the spread of the data? Figure 2 shows a hat plot for the plot 
in Figure 1 with the data remaining visible. How does the graph help answer 
the question about how accurately the arm span can be measured? What is 
the best estimate of the selected person’s arm span length from the data 
collected? Looking at the crown of the hat should help narrow the value of 
the estimated arm span without forcing the choice of a single value, such 
as the mean or median. 
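
For teachers who want to check the summary values away from TinkerPlots, the following is a minimal sketch of the same calculations. The 14 measurements are invented for illustration and are not the data behind Figure 1; the function name `hat_plot_summary` is hypothetical, and the quartile method used for the crown is an assumption, since TinkerPlots may place the crown edges slightly differently.

```python
import statistics

# Invented arm span measurements (cm) of one person by 14 measurers.
# These are illustrative values only, not the data shown in Figure 1.
measurements = [180.5, 181, 181.5, 181.5, 182, 182, 182, 182,
                182.5, 182.5, 183, 183.5, 186, 187]


def hat_plot_summary(values):
    """Return the averages, the range, and the three parts of a default hat plot.

    Assumption: the crown runs from the 25th to the 75th percentile,
    computed with the 'inclusive' quartile method of the statistics module.
    """
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "range": ordered[-1] - ordered[0],
        "left_brim": (ordered[0], q1),    # lowest quarter of the values
        "crown": (q1, q3),                # middle half of the values
        "right_brim": (q3, ordered[-1]),  # highest quarter of the values
    }


summary = hat_plot_summary(measurements)
for name, value in summary.items():
    print(name, value)
```

Reporting the crown as an interval, rather than a single average, mirrors the suggestion above that the estimate can be narrowed without forcing the choice of one value.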



Figure 2. Example of a Hat Plot showing the spread of information
when measuring arm span length.

Chance and sampling questions


How could a better estimate of the selected person’s arm span be obtained? 
Discussion could focus, for example, on selecting another sample of
measures, collecting a larger sample of measurements, or using a more 
consistent measuring instrument or technique. 

Ways of randomly choosing a sample of measurements for this problem
could be discussed. Would the same data set, mean, median or mode be 
obtained each time? How chance selection of a sample from a much larger 
set of measurements might affect the mean, or other values, could be an 
interesting topic of discussion. 
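
The effect of chance selection can be explored with a short simulation. The sketch below is only illustrative: the "much larger set of measurements" is generated artificially around an assumed true value of 182 cm, and the function names, sample size and random seeds are all hypothetical choices.

```python
import random
import statistics


def round_to_half(x):
    """Round to the nearest 0.5 cm, as in the measurement instructions."""
    return round(x * 2) / 2


def make_measurements(count, true_value=182.0, spread=1.5, seed=1):
    """Create an artificial set of repeated measurements (assumed values)."""
    rng = random.Random(seed)
    return [round_to_half(rng.gauss(true_value, spread)) for _ in range(count)]


def sample_summaries(measurements, sample_size, repeats, seed=2):
    """Draw several random samples and return (mean, median) for each one."""
    rng = random.Random(seed)
    summaries = []
    for _ in range(repeats):
        sample = rng.sample(measurements, sample_size)  # without replacement
        summaries.append((statistics.mean(sample), statistics.median(sample)))
    return summaries


measurements = make_measurements(200)
for mean, median in sample_summaries(measurements, sample_size=14, repeats=5):
    print(f"mean = {mean:.2f} cm, median = {median:.1f} cm")
```

Running the loop with a different seed or a larger sample size gives students a concrete way to see that the mean and median change from sample to sample, and to discuss whether larger samples change less.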

Drawing a conclusion 

Students should finally write a summary report, including all of the 
assumptions made, to explain how accurately the group measured the arm 
span of a single person and what the best estimate is. Decisions about the 
potential outliers and their inclusion in or exclusion from the analysis need 
to be included in the report. Suggestions for further investigation are also 
valuable to include. This report can be written in a text box in TinkerPlots
to include with the plots created or the plots can be copied and pasted into 
Word documents. An “informal” inference reached should include a “best” 
estimate for the person’s arm span, perhaps expressed as a range and with 
some statement about the degree of confidence with which the estimate is 
made. 


Investigation 2: Measuring arm spans of a group 

A natural progression from Investigation 1 is to consider the typical arm 
span of a class of students, or of students of a certain age. Students should 
have a feel for the accuracy of their individual measurements (and may 
want to take several measurements and average them in some way). The 
following steps suggest a possible pathway. 

Setting the question 

What is the typical arm span measurement of grade X students? 

Data collection (interval data) 

Students should discuss how their particular class can contribute to
answering the rather general question. After discussing and deciding on a 
method of measurement, the next issue is how many measurements would 
be needed for a good estimate of the typical arm span length for the class. 
All students then have their arm spans measured (say, with arms spread
out, to the nearest 0.5 cm).

Representing data 

The next step is to create a table similar to Table 1 for students to record 
their data (or add to the previous data set, perhaps as “My armspan”). This 
information can be entered into TinkerPlots by each student and
representations created for the class data. Students should be given freedom to
create their own preferred graphical form. Again there may be occasion to
discuss outliers if there are some very unusual measurements recorded. 


Summarising data 


Students should be asked to fill in a text box in TinkerPlots summarising 
what their representation tells them about the typical arm span of their 
class. This may involve discussion of the mean, median, mode, or range as 
found on their graphs. They may use the hat plot to discuss the shape of 
the data and the variation in the data. Figure 3 shows an example. If
Investigation 1 has preceded Investigation 2, a discussion point would be 
the difference in the variation shown in the two stacked dot plots in Figures 
2 and 3. Why would the second be expected to show more variation? 



Figure 3. Example of a 
Hat Plot showing the 
shape of and variation 
in the data when 
measuring the arm 
span lengths of a 
group of people. 



Chance 


Students should discuss ways of randomly choosing a sample for answering 
this question. Perhaps there are other grade X classes in the school that 
could be measured. How would chance and a different sample affect the 
mean, the median, the variation, and the shape of the hat plot? 

Drawing a conclusion 

Students should then write a report, complete with graphs, including all of 
the assumptions made, to explain how the class arrived at its estimate of a 
typical arm span length for grade X students and to indicate its degree of
confidence in the estimate. 


Investigation 3: Comparing measurements on two groups

A natural extension of Investigation 2 is to ask a question that compares 
two groups, perhaps boys and girls, or students in different grades. 
Questions to consider might be: Do boys have greater arm spans than girls? 
Do students across the middle years have increasing arm spans with higher 
grades? To make formal inferences about these questions for a state or 
country would require random samples and advanced techniques but much 
can be learned about the processes in the informal inference arena. 

The data collection and representation tasks would be similar to 
Investigations 1 and 2. As an example, Figure 4 shows a portion of a
TinkerPlots table with data from 58 students in grades 5 to 8 with gender 
also included. Two interesting comparisons are possible from this data set. 
Figure 5 shows the stacked dot plots for the boys and girls in the middle 
years, whereas Figure 6 shows the stacked dot plots for the grades. 

Figure 4. An example of a TinkerPlots table showing data from students in
grades 5 to 8.

Figure 5. Stacked dot plot of arm span length by gender.


For this data set some 
very interesting observa- 
tions about differences in 
variation as well as typical 
arm span can be made. The 
students in this school, for 
example, concluded that the 
variation in arm spans of 
boys in the middle school 
was greater than the varia- 
tion in the arm spans of 
girls in the same grades. 

They also concluded that 
arm span increased from 
grade 5 to grade 6 and from 
grade 6 to grade 7, but then 
levelled off, probably related 
to growth spurts up to grade 7. Including hat plots in the graphical repre- 
sentations further enhances the discussion of “middles” and spread. For 
this school as the population, the students could make definitive state- 
ments about the data sets, to answer the questions and speculate about 
causes; but for a larger population, they would have to reach informal infer- 
ences and acknowledge uncertainty. 


Investigation 4: Comparing measurements on two variables

An extension to Investigation 3 is to ask a question about the association 
between two variables within the same group, namely arm span length and 
height. In this investigation, students can be introduced (or reintroduced) 
to da Vinci’s Vitruvian Man and asked to consider the questions: Is there an 
association between people’s arm spans and their heights? Are they the 
same or nearly the same? Is there a “cause” of the association? 

Data collection can involve students measuring their heights (say, with
shoes off, to the nearest 0.5 cm) and their arm spans (if not measured before).
These data need to be recorded, perhaps on a whiteboard or worksheet, in
a manner that makes it easy for students to enter them into new data cards
in TinkerPlots (or into the data cards from Investigation 1).

Figure 7. Data card and graph showing the association between
arm span and height in a group of preservice teachers.

If they have the appropriate background, students can then produce 
association graphs such as the one represented in Figure 7, with height on 
one axis and arm span length on the other. Older students can also use a 
calculator and see if there is 
a significant correlation 
between the two attributes. 
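
For older students, the correlation can also be worked out step by step rather than read from a calculator. The following sketch computes the Pearson correlation coefficient for a small set of invented height and arm span pairs; the data and the function name `pearson_r` are assumptions for illustration. It reports only the strength of the linear association and does not carry out a test of significance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Invented measurements (cm), listed in the same order of people.
heights = [158.0, 162.5, 165.0, 168.0, 170.5, 173.0, 176.5, 181.0]
arm_spans = [156.5, 163.0, 165.0, 169.5, 169.0, 175.0, 175.5, 183.0]

print(f"r = {pearson_r(heights, arm_spans):.3f}")
```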

Summarising the data 
may involve finding the 
mean of the data on each 
axis and discussing any 
outliers. Using the drawing 
tool in TinkerPlots, it is 
possible to draw a “line of 
best fit” showing the associ- 
ation between height and 
arm span (an example of
this is shown in Figure 8). 

Figure 8. The "line of best fit" for arm span length and height
in a group of preservice teachers.

Younger students who may not be familiar with scattergraphs may suggest
subtracting arm span length from height to see if the results are zero or
close to zero. This can be easily done using a special formula in TinkerPlots
that provides the difference between the two attributes. Figure 9 contains
the formula box showing how this can be achieved. The difference can then be
represented graphically, as in Figure 10. Of interest in Figure 10 is how
many differences (My Height - My Armspan) are equal to zero, positive, or
negative. What is a reasonable difference from zero that students could
observe and still be able to say that the two measurements are “roughly the
same”? The hat plot might be useful here. There is no definitive answer and
again it might be necessary to check for (and explain) outliers.

Figure 9. Data card and formula box showing how to find the
difference between arm span length and height.
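
The subtraction approach can be sketched in a few lines outside TinkerPlots. The measurements below are invented, the function names are hypothetical, and the tolerance of 1.5 cm for "roughly the same" is only one possible choice that a class might argue for or against.

```python
# Invented measurements (cm), listed in the same order of people.
heights = [158.0, 162.5, 165.0, 168.0, 170.5, 173.0, 176.5, 181.0]
arm_spans = [156.5, 163.0, 165.0, 169.5, 169.0, 175.0, 175.5, 183.0]


def differences(heights, arm_spans):
    """Height minus arm span for each person."""
    return [h - a for h, a in zip(heights, arm_spans)]


def count_signs(diffs):
    """Count how many differences are zero, positive, or negative."""
    return {
        "zero": sum(1 for d in diffs if d == 0),
        "positive": sum(1 for d in diffs if d > 0),
        "negative": sum(1 for d in diffs if d < 0),
    }


def share_within(diffs, tolerance):
    """Fraction of people whose two measurements differ by at most tolerance."""
    return sum(1 for d in diffs if abs(d) <= tolerance) / len(diffs)


diffs = differences(heights, arm_spans)
print(count_signs(diffs))
print(share_within(diffs, tolerance=1.5))  # assumed tolerance in cm
```

Changing the tolerance and watching the share rise or fall is one way of making the "no definitive answer" point concrete.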

Other students, perhaps 
at an age between those who 
would subtract and those 
who would draw a scatter- 
graph, might suggest 
dividing one measurement 
by the other and seeing how 
close the ratios are to one. 
This idea is of course related 
to how close the points on a 
scattergraph lie to the 
straight line drawn at 45° 
from the origin. Figure 11
shows the formula box and 
what the associated graph 
would look like for the 
preservice teachers’ data. 
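
The ratio approach can be sketched in the same way. Each ratio of height to arm span is the slope of the line from the origin to that person's point on a scattergraph with arm span on the horizontal axis, so a ratio of exactly one places the point on the 45° line. The data are invented and the function name is hypothetical.

```python
import statistics

# Invented measurements (cm), listed in the same order of people.
heights = [158.0, 162.5, 165.0, 168.0, 170.5, 173.0, 176.5, 181.0]
arm_spans = [156.5, 163.0, 165.0, 169.5, 169.0, 175.0, 175.5, 183.0]


def height_to_arm_span_ratios(heights, arm_spans):
    """Height divided by arm span for each person."""
    return [h / a for h, a in zip(heights, arm_spans)]


ratios = height_to_arm_span_ratios(heights, arm_spans)
print(f"smallest ratio: {min(ratios):.3f}")
print(f"median ratio:   {statistics.median(ratios):.3f}")
print(f"largest ratio:  {max(ratios):.3f}")

# Largest distance of any ratio from one: a rough measure of how far the
# points stray from the 45° line.
print(f"largest gap from 1: {max(abs(r - 1) for r in ratios):.3f}")
```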

Using a TinkerPlots text 
box, students can write a 
report, setting the context 
for the question, answering 
the question, and explaining 
how the analysis was 
carried out for their infer- 
ences. They can also 
speculate on the “cause” of 
this association, being 
careful to use probabilistic 
rather than declarative 
language. It would be inter- 
esting in a class if different 
groups of students 
presented these three repre- 
sentations (or others) and 
their associated arguments 
to answer the question. 
Students could discuss 
which was the most 
convincing. 

Figure 10. Graph showing the difference between arm span length and
height in a group of preservice teachers.

Figure 11. Formula box for the ratio of height to arm span and
the associated graph.


Conclusion


The purpose of this article has been to motivate teachers to present their 
students with meaningful investigations that lead to an appreciation of the 
types of questions that informal inference can help to answer. The chance
and data curriculum is about much more than finding averages and 
drawing graphs. All three averages found in curriculum documents — 
mean, median and mode — can be illustrated in these investigations. In 
Figure 1, for example, the median and the mode are both 182, whereas the 
mean ranges from 182.6 to 182 depending on whether the two potential 
outliers are included or not. These observations should not, however, be the 
only focus of Investigation 1: variation observed, reasons for it, and
consequent qualified statements about accuracy are essential to a meaningful
report. Although these investigations could be carried out without the use 
of TinkerPlots, the package can save time and add creativity and student
ownership to the production of evidence and the creation of a final report 
answering the initial questions. 